IRIT at INEX 2012: Tweet Contextualization

Authors

  • Liana Ermakova
  • Josiane Mothe
Abstract

In this paper, we describe an approach to tweet contextualization developed in the context of INEX 2012. The task was to provide, for a given tweet, a context of up to 500 words drawn from Wikipedia. As a baseline system, we used the TF-IDF cosine similarity measure enriched by smoothing from local context, named entity recognition and part-of-speech weighting, as presented at INEX 2011. We modified this method by adding bigram similarity, anaphora resolution, hashtag processing and sentence reordering. The sentence ordering task was modeled as a sequential ordering problem, where vertices corresponded to sentences and sequential constraints were represented by sentence time stamps.
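As a rough illustration of the pipeline outlined in the abstract, the sketch below scores candidate sentences against a tweet using TF-IDF cosine similarity plus a bigram overlap term, then orders the selected sentences by time stamp. It is a minimal sketch under stated assumptions, not the authors' system. The function names, the `bigram_weight` and `top_k` parameters, the Jaccard form of the bigram term, and the additive score combination are all hypothetical. Sorting by time stamp is a simplification that stands in for the sequential ordering problem. Smoothing, named entity recognition, part-of-speech weighting and anaphora resolution are omitted.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and split on non-alphanumeric characters; '#' is dropped,
    # so a hashtag such as "#Mars" matches the plain word "mars".
    return "".join(c.lower() if c.isalnum() else " " for c in text).split()


def tfidf_vectors(docs):
    # Build sparse TF-IDF vectors (dicts) with a smoothed IDF (an assumption).
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in df}
    return [{t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized]


def cosine(u, v):
    # Cosine similarity between two sparse vectors.
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def bigram_overlap(a, b):
    # Jaccard overlap of word bigrams (one possible bigram similarity).
    ta, tb = tokenize(a), tokenize(b)
    ba, bb = set(zip(ta, ta[1:])), set(zip(tb, tb[1:]))
    union = ba | bb
    return len(ba & bb) / len(union) if union else 0.0


def contextualize(tweet, sentences, top_k=2, bigram_weight=0.5):
    # sentences: list of (text, timestamp) pairs (hypothetical input format).
    vectors = tfidf_vectors([tweet] + [text for text, _ in sentences])
    scored = []
    for (text, ts), vec in zip(sentences, vectors[1:]):
        score = cosine(vectors[0], vec) + bigram_weight * bigram_overlap(tweet, text)
        scored.append((score, ts, text))
    best = sorted(scored, key=lambda item: item[0], reverse=True)[:top_k]
    # Simplified ordering: present the selected sentences by time stamp.
    return [text for _, _, text in sorted(best, key=lambda item: item[1])]


candidates = [
    ("The Mars rover landing succeeded in August.", 2),
    ("Bananas are rich in potassium.", 1),
    ("NASA launched the Mars rover last year.", 0),
]
context = contextualize("#Mars rover landing", candidates)
```

Here `context` holds the two sentences about the Mars rover, with the earlier-stamped sentence first. A real system would also enforce the 500-word limit when selecting sentences.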

Similar resources

IRIT at INEX 2013: Tweet Contextualization Track

The paper presents IRIT’s approach used at the INEX Tweet Contextualization Track 2013. Systems had to provide a context to a tweet. This year we further modified our approach presented at INEX 2011 and 2012, which is based on the product of scores derived from hashtag processing, TF-IDF cosine similarity measure enriched by smoothing from local context and document beginning, named entity recognition and pa...

Two Statistical Summarizers at INEX 2012 Tweet Contextualization Track

According to the organizers, the objective of the 2012 INEX Tweet Contextualization Task is: “...given a tweet, the system must provide some context about the subject of the tweet, in order to help the reader to understand it. This context should take the form of a readable (and short) summary, composed of passages from [...] Wikipedia.” We present summarizers Cortex and KL-summ applied to the ...

IRIT at INEX: Question Answering Task

In this paper we describe an approach to tweet contextualization developed in the context of the INEX QA track. The task is to provide a context of up to 500 words to a tweet. The summary should be an extract from Wikipedia. Our approach is based on an index which includes not only lemmas, but also named entities. Sentence retrieval is based on the standard TF-IDF measure enriched by named entity rec...

The 2012 INEX Snippet and Tweet Contextualization Tasks

This paper reports on our current experiments involving the Snippet and Tweet Contextualization Tracks of the 2012 INEX competition. Most of this work in snippet generation extends our earlier (2011) approach, described in [4], which produced a top-ranked result. The source of the snippet in these experiments is the top-ranked focused element(s) of the document in question. Another approach is ...

An Automatic Greedy Summarization System at INEX 2013 Tweet Contextualization Track

According to the organizers, the aim of the 2013 INEX Tweet Contextualization Track is: “...given a tweet, the system must provide some context about the subject of the tweet, in order to help the reader to understand it. This context should take the form of a readable (and short) summary, composed of passages from [...] Wikipedia.” We present an automatic greedy summarizer named REG applied to...

Publication date: 2012